29 research outputs found

    An Architecture for Deploying Reinforcement Learning in Industrial Environments

    Industry 4.0 is driven by demands like shorter time-to-market, mass customization of products, and batch size one production. Reinforcement Learning (RL), a machine learning paradigm shown to have great potential for improving on and surpassing human-level performance in numerous complex tasks, helps to cope with these demands. In this paper, we present an OPC UA based Operational Technology (OT)-aware RL architecture, which extends the standard RL setting by combining it with the setting of digital twins. Moreover, we define an OPC UA information model that allows a generalized, plug-and-play-like approach to exchanging the RL agent used. Finally, we demonstrate and evaluate the architecture by creating a proof of concept. By solving a toy example, we show that this architecture can be used to determine the optimal policy using a real control system. Comment: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this contribution is published in Computer Aided Systems Theory - EUROCAST 2022 and is available online at https://doi.org/10.1007/978-3-031-25312-6_6

    Deep Q-Learning versus Proximal Policy Optimization: Performance Comparison in a Material Sorting Task

    This paper presents a comparison between two well-known deep Reinforcement Learning (RL) algorithms, Deep Q-Learning (DQN) and Proximal Policy Optimization (PPO), in a simulated production system. We utilize a Petri Net (PN)-based simulation environment, which was previously proposed in related work. The performance of the two algorithms is compared based on several evaluation metrics, including average percentage of correctly assembled and sorted products, average episode length, and percentage of successful episodes. The results show that PPO outperforms DQN on all evaluation metrics. The study highlights the advantages of policy-based algorithms in problems with high-dimensional state and action spaces. The study contributes to the field of deep RL in the context of production systems by providing insights into the effectiveness of different algorithms and their suitability for different tasks. Comment: Submitted and accepted version to the 32nd International Symposium on Industrial Electronics (ISIE), Helsinki, Finland

    A Modular Test Bed for Reinforcement Learning Incorporation into Industrial Applications

    This application paper explores the potential of using reinforcement learning (RL) to address the demands of Industry 4.0, including shorter time-to-market, mass customization, and batch size one production. Specifically, we present a use case in which the task is to transport and assemble goods through a model factory following predefined rules. Each simulation run involves placing a specific number of goods of random color at the entry point. The objective is to transport the goods to the assembly station, where two rivets are installed in each product, connecting the upper part to the lower part. After the rivets are installed, blue products must be transported to the exit, while green products are to be transported to storage. The study focuses on the application of reinforcement learning techniques to address this problem and improve the efficiency of the production process. Comment: Submitted and accepted version to the 5th International Data Science Conference (iDSC), Krems, Austria

    BMC Bioinformatics / MAESTRO - multi agent stability prediction upon point mutations

    Background: Point mutations can have a strong impact on protein stability. A change in stability may subsequently lead to dysfunction and finally cause diseases. Moreover, protein engineering approaches aim to deliberately modify protein properties, where stability is a major constraint. In order to support basic research and protein design tasks, several computational tools for predicting the change in stability upon mutations have been developed. Comparative studies have shown the usefulness but also limitations of such programs. Results: We aim to contribute a novel method for predicting changes in stability upon point mutation in proteins called MAESTRO. MAESTRO is structure based and distinguishes itself from similar approaches in the following points: (i) MAESTRO implements a multi-agent machine learning system. (ii) It also provides predicted free energy change (ΔΔG) values and a corresponding prediction confidence estimation. (iii) It provides high throughput scanning for multi-point mutations where sites and types of mutation can be comprehensively controlled. (iv) Finally, the software provides a specific mode for the prediction of stabilizing disulfide bonds. The predictive power of MAESTRO for single point mutations and stabilizing disulfide bonds is comparable to similar methods. Conclusions: MAESTRO is a versatile tool in the field of stability change prediction upon point mutations. Executables for the Linux and Windows operating systems are freely available to non-commercial users from http://biwww.che.sbg.ac.at/MAESTRO. Josef Laimer, Heidi Hofer, Marko Fritz, Stefan Wegenkittl and Peter Lackner

    The pLab Picturebook: Load Tests and Ultimate Load Tests, Part II: Subsequences Report

    This is Part II of an exhaustive empirical study on the equidistribution and correlation properties of pseudorandom numbers. Here, we apply the test design introduced in Part I to the analysis of well-chosen subsequences, which arise from splitting a given sequence of pseudorandom numbers. Such a setup is common within parallel applications. Theory predicts extreme sensitivity of linear congruential generators and assures the robustness of inversive methods. We give striking examples of how this can affect a stochastic simulation, i.e. an empirical test. Keywords: pseudorandom number generation, empirical tests, stochastic simulation, subsequence analysis, splitting methods. 1 Introduction. A well-known technique to achieve parallel streams of pseudorandom numbers for stochastic simulation is so-called splitting into subsequences, which is also known as leapfrog or jump-ahead. A given generator producing the sequence $x_0, x_1, \ldots$ is split into $k$ parallel streams by setting $x^{(l)}$ ..
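The splitting idea in this abstract can be illustrated with a minimal sketch. The source's own formula is truncated, so the code assumes the common leapfrog form in which stream l takes every k-th element starting at index l. The generator parameters and the function names `lcg` and `leapfrog` are hypothetical, chosen only for illustration, and are not taken from the report.

```python
from itertools import islice


def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Yield an endless LCG sequence x_{n+1} = (a * x_n + c) mod m.

    The parameters are illustrative defaults, not those studied in the report.
    """
    x = seed
    while True:
        yield x
        x = (a * x + c) % m


def leapfrog(seed, k, l, count):
    """Return `count` elements of stream l: x_l, x_{l+k}, x_{l+2k}, ...

    Assumed leapfrog form; the abstract's own definition is cut off.
    """
    return list(islice(lcg(seed), l, l + k * count, k))


# Split one base sequence into k = 3 parallel streams of 4 numbers each.
base = list(islice(lcg(1), 12))
streams = [leapfrog(1, k=3, l=l, count=4) for l in range(3)]
```

Interleaving the three streams reproduces the base sequence, which is why correlations between distant elements of the original generator turn into correlations between adjacent elements of each stream.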

    A Generalized φ-Divergence for Asymptotically Multivariate Normal Models

    I. Csiszár's (Magyar. Tud. Akad. Mat. Kutató Int. Közl. 8 (1963), 85-108) φ-divergence, which was considered independently by M. S. Ali and S. D. Silvey (J. R. Statist. Soc. Ser. B 28 (1966), 131-142), gives a goodness-of-fit statistic for multinomial distributed data. We define a generalized φ-divergence that unifies the φ-divergence approach with that of C. R. Rao and S. K. Mitra ("Generalized Inverse of Matrices and Its Applications," Wiley, New York, 1971) and derive weak convergence to a χ² distribution under the assumption of asymptotically multivariate normal distributed data vectors. As an example we discuss the application to the frequency count in Markov chains and thereby give a goodness-of-fit test for observations from dependent processes with finite memory. Keywords: distribution of statistics; hypothesis testing; Markov processes: hypothesis testing (Inference from stochastic processes); asymptotic distribution theory
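As a rough illustration of the classical multinomial case mentioned in this abstract (not the generalized divergence the paper defines), the sketch below computes a φ-divergence between observed relative frequencies and hypothesized cell probabilities. It assumes the textbook form D_φ(p, q) = Σ q_i φ(p_i / q_i) and the usual scaling 2n / φ''(1). With φ(t) = (t - 1)², this scaled statistic coincides with Pearson's chi-square statistic. The counts and function names are hypothetical.

```python
import math


def phi_divergence(p, q, phi):
    """Classical phi-divergence sum_i q_i * phi(p_i / q_i) for discrete distributions."""
    return sum(qi * phi(pi / qi) for pi, qi in zip(p, q))


def pearson_phi(t):
    """phi(t) = (t - 1)^2, whose second derivative at t = 1 equals 2."""
    return (t - 1.0) ** 2


# Hypothetical observed counts in four cells, tested against a uniform hypothesis.
counts = [18, 22, 29, 31]
n = sum(counts)
q = [0.25, 0.25, 0.25, 0.25]
p_hat = [c / n for c in counts]

# Scaled statistic (2n / phi''(1)) * D_phi(p_hat, q); here phi''(1) = 2.
stat = (2 * n / 2.0) * phi_divergence(p_hat, q, pearson_phi)

# Pearson's chi-square statistic computed directly, for comparison.
pearson = sum((c - n * qi) ** 2 / (n * qi) for c, qi in zip(counts, q))
```

Other choices of φ give other members of the same family under this assumed form; the example only shows the Pearson special case.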

    Gambling Tests for Pseudorandom Number Generators

    This paper extends the idea of serial tests by employing a carefully selected dimension reduction which is equivalent to playing a gambling strategy in a fair coin flipping game. We apply the generalized φ-divergence for testing the hypothesis that the simulated coin is fair and memoryless. An application to Twisted GFSR generators shows the ability of our test to detect deviations from equidistribution in high dimensions. 1 Introduction. Numerous tests have been suggested for the empirical quality assessment of pseudorandom number generators (PRNGs), see [5] for an introduction. The standard battery of tests, including serial (relative frequency based) tests for overlapping and non-overlapping tuples, and run (permutation based) tests, has recently been extended towards random-walk simulation [3, 13-17]. We somewhat follow this direction and consider a gambling policy in a simple fair coin flipping game. The formulation in terms of gambling being transparent and evident, our test also..

    Empirical Evidence concerning AES


    Benefits from Variational Regularization in Language Models

    Representations from common pre-trained language models have been shown to suffer from the degeneration problem, i.e., they occupy a narrow cone in latent space. This problem can be addressed by enforcing isotropy in latent space. In analogy with variational autoencoders, we suggest applying a token-level variational loss to a Transformer architecture and optimizing the standard deviation of the prior distribution in the loss function as a model parameter to increase isotropy. The resulting latent space is complete and interpretable: any given point is a valid embedding and can be decoded into text again. This allows for text manipulations such as paraphrase generation directly in latent space. Surprisingly, features extracted at the sentence level also show competitive results on benchmark classification tasks.